Skip to content

Mesh Pipelines#982

Open
GDBobby wants to merge 5 commits intoDevsh-Graphics-Programming:masterfrom
GDBobby:master
Open

Mesh Pipelines#982
GDBobby wants to merge 5 commits intoDevsh-Graphics-Programming:masterfrom
GDBobby:master

Conversation

@GDBobby
Copy link

@GDBobby GDBobby commented Jan 6, 2026

Mesh Pipelines

Description of changes that aren't implicit.

  1. Reordered every mention of Graphics, Compute, Mesh, and Raytracing pipelines to follow that order. For example, in line 331 of IGPUCommandBuffer, Compute was moved below Graphics, and mesh was put below compute, which is above raytracing.
  2. In ILogicalDevice, I abstracted the commonalities between Graphics and Mesh pipeline creation. The end result for Graphics doesn't change at all.
  3. I'll probably need help with the features and limits. I included the bare minimum.
  4. As requested, GetBoundGraphicsPipeline now returns IGPUPipelineBase, and both mesh and graphics bind as the RasterizationPipeline.

Testing

https://github.com/GDBobby/Nabla-Examples-and-Tests/tree/master/MeshShader

@devshgraphicsprogrammingjenkins
Copy link
Contributor

[CI]: Can one of the admins verify this patch?

@devshgraphicsprogramming
Copy link
Member

devshgraphicsprogramming commented Feb 23, 2026

@GDBobby there's a huge PR dropping this week #1000

After this there will be an example in CI mode that loads dozens of models and screenshots them

Let me know if thats something that will help you

Either way the branch will need to get master merged into it so the CI can run.

Comment on lines -588 to +596
const IGPUGraphicsPipeline* getBoundGraphicsPipeline() const { return m_boundGraphicsPipeline; }
const IGPUPipelineBase* getBoundGraphicsPipeline() const { return m_boundRasterizationPipeline; }

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we have a RasterizationPipelineBase, cause PipelineBase is also inherited from by RT and compute

Comment on lines -1205 to +1334
void CVulkanLogicalDevice::createGraphicsPipelines_impl(
IGPUPipelineCache* const pipelineCache,
const std::span<const IGPUGraphicsPipeline::SCreationParams> createInfos,
core::smart_refctd_ptr<IGPUGraphicsPipeline>* const output,
const SSpecializationValidationResult& validation
)
{
auto getVkStencilOpStateFrom = [](const asset::SStencilOpParams& params)->VkStencilOpState
{
return {
.failOp = static_cast<VkStencilOp>(params.failOp),
.passOp = static_cast<VkStencilOp>(params.passOp),
.depthFailOp = static_cast<VkStencilOp>(params.depthFailOp),
.compareOp = static_cast<VkCompareOp>(params.compareOp)
};
void PopulateViewport(VkPipelineViewportStateCreateInfo& outViewport, nbl::asset::SRasterizationParams const& raster) {
outViewport.viewportCount = raster.viewportCount;
// must be identical to viewport count unless VK_DYNAMIC_STATE_VIEWPORT_WITH_COUNT_EXT or VK_DYNAMIC_STATE_SCISSOR_WITH_COUNT_EXT are used
outViewport.scissorCount = raster.viewportCount;
}


void PopulateRaster(VkPipelineRasterizationStateCreateInfo& outRaster, nbl::asset::SRasterizationParams const& raster) {
outRaster.depthClampEnable = raster.depthClampEnable;
outRaster.rasterizerDiscardEnable = raster.rasterizerDiscard;
outRaster.polygonMode = static_cast<VkPolygonMode>(raster.polygonMode);
outRaster.cullMode = static_cast<VkCullModeFlags>(raster.faceCullingMode);
outRaster.frontFace = raster.frontFaceIsCCW ? VK_FRONT_FACE_COUNTER_CLOCKWISE : VK_FRONT_FACE_CLOCKWISE;
outRaster.depthBiasEnable = raster.depthBiasEnable;
}

void PopulateMultisample(VkPipelineMultisampleStateCreateInfo& outMultisample, nbl::asset::SRasterizationParams const& raster) {
outMultisample.rasterizationSamples = static_cast<VkSampleCountFlagBits>(0x1 << raster.samplesLog2);
if (raster.minSampleShadingUnorm > 0) {
outMultisample.sampleShadingEnable = true;
outMultisample.minSampleShading = float(raster.minSampleShadingUnorm) / 255.f;
}
else {
outMultisample.sampleShadingEnable = false;
outMultisample.minSampleShading = 0.f;
}
outMultisample.pSampleMask = raster.sampleMask;
outMultisample.alphaToCoverageEnable = raster.alphaToCoverageEnable;
outMultisample.alphaToOneEnable = raster.alphaToOneEnable;
}
VkStencilOpState getVkStencilOpStateFrom(const asset::SStencilOpParams& params) {
return {
.failOp = static_cast<VkStencilOp>(params.failOp),
.passOp = static_cast<VkStencilOp>(params.passOp),
.depthFailOp = static_cast<VkStencilOp>(params.depthFailOp),
.compareOp = static_cast<VkCompareOp>(params.compareOp)
};
}

const auto& features = getEnabledFeatures();
void PopulateDepthStencil(VkPipelineDepthStencilStateCreateInfo& outDepthStencil, nbl::asset::SRasterizationParams const& raster) {
outDepthStencil.depthTestEnable = raster.depthTestEnable();
outDepthStencil.depthWriteEnable = raster.depthWriteEnable;
outDepthStencil.depthCompareOp = static_cast<VkCompareOp>(raster.depthCompareOp);
outDepthStencil.depthBoundsTestEnable = raster.depthBoundsTestEnable;
outDepthStencil.stencilTestEnable = raster.stencilTestEnable();
outDepthStencil.front = getVkStencilOpStateFrom(raster.frontStencilOps);
outDepthStencil.back = getVkStencilOpStateFrom(raster.backStencilOps);
}

void PopulateColorBlend(
VkPipelineColorBlendStateCreateInfo& outColorBlend,
VkPipelineColorBlendAttachmentState*& outColorBlendAttachmentState,
nbl::asset::SBlendParams const& blend,
nbl::asset::IRenderpass::SCreationParams::SSubpassDescription const& subpass
) {
//outColorBlend->flags no attachment order access yet
outColorBlend.logicOpEnable = blend.logicOp != asset::ELO_NO_OP;
outColorBlend.logicOp = getVkLogicOpFromLogicOp(blend.logicOp);
outColorBlend.pAttachments = outColorBlendAttachmentState;
for (auto i = 0; i < IGPURenderpass::SCreationParams::SSubpassDescription::MaxColorAttachments; i++) {
if (subpass.colorAttachments[i].render.used()) {
const auto& params = blend.blendParams[i];
outColorBlendAttachmentState->blendEnable = params.blendEnabled();
outColorBlendAttachmentState->srcColorBlendFactor = getVkBlendFactorFromBlendFactor(static_cast<asset::E_BLEND_FACTOR>(params.srcColorFactor));
outColorBlendAttachmentState->dstColorBlendFactor = getVkBlendFactorFromBlendFactor(static_cast<asset::E_BLEND_FACTOR>(params.dstColorFactor));
outColorBlendAttachmentState->colorBlendOp = getVkBlendOpFromBlendOp(static_cast<asset::E_BLEND_OP>(params.colorBlendOp));
outColorBlendAttachmentState->srcAlphaBlendFactor = getVkBlendFactorFromBlendFactor(static_cast<asset::E_BLEND_FACTOR>(params.srcAlphaFactor));
outColorBlendAttachmentState->dstAlphaBlendFactor = getVkBlendFactorFromBlendFactor(static_cast<asset::E_BLEND_FACTOR>(params.dstAlphaFactor));
outColorBlendAttachmentState->alphaBlendOp = getVkBlendOpFromBlendOp(static_cast<asset::E_BLEND_OP>(params.alphaBlendOp));
outColorBlendAttachmentState->colorWriteMask = getVkColorComponentFlagsFromColorWriteMask(params.colorWriteMask);
outColorBlendAttachmentState++;
//^that pointer iterator is how we ensure the attachments or consecutive
}
}
outColorBlend.attachmentCount = std::distance<const VkPipelineColorBlendAttachmentState*>(outColorBlend.pAttachments, outColorBlendAttachmentState);
}

template<typename SCreationParams>
void PopulateMeshGraphicsCommonData(
const std::span<const SCreationParams> createInfos,
core::vector<VkGraphicsPipelineCreateInfo>& vk_createInfos,

core::vector<VkPipelineViewportStateCreateInfo>& vk_viewportStates,
core::vector<VkPipelineRasterizationStateCreateInfo>& vk_rasterizationStates,
core::vector<VkPipelineMultisampleStateCreateInfo>& vk_multisampleStates,
core::vector<VkPipelineDepthStencilStateCreateInfo>& vk_depthStencilStates,
core::vector<VkPipelineColorBlendStateCreateInfo>& vk_colorBlendStates,
core::vector<VkPipelineColorBlendAttachmentState>& vk_colorBlendAttachmentStates,

core::vector<VkDynamicState>& vk_dynamicStates,
const VkPipelineDynamicStateCreateInfo& vk_dynamicStateCreateInfo
) {
//the main concern is lifetime, so don't want to construct, move, or copy anything in here

auto outColorBlendAttachmentState = vk_colorBlendAttachmentStates.data(); //the pointer iterator is used

core::vector<VkDynamicState> vk_dynamicStates = {

for (uint32_t i = 0; i < createInfos.size(); i++) { //whats the maximum number of pipelines that can be created at once? uint32_t to be safe
auto& info = createInfos[i];
const auto& blend = info.cached.blend;
const auto& raster = info.cached.rasterization;
const auto& subpass = info.renderpass->getCreationParameters().subpasses[info.cached.subpassIx];

initPipelineCreateInfo(&vk_createInfos[i], info);

PopulateViewport(vk_viewportStates[i], raster);
PopulateRaster(vk_rasterizationStates[i], raster);
PopulateMultisample(vk_multisampleStates[i], raster);
PopulateDepthStencil(vk_depthStencilStates[i], raster);
PopulateColorBlend(vk_colorBlendStates[i], outColorBlendAttachmentState, blend, subpass);
//PopulateDynamicState(dynState, ?)


vk_createInfos[i].pViewportState = &vk_viewportStates[i];
vk_createInfos[i].pRasterizationState = &vk_rasterizationStates[i];
vk_createInfos[i].pMultisampleState = &vk_multisampleStates[i];
vk_createInfos[i].pDepthStencilState = &vk_depthStencilStates[i];
vk_createInfos[i].pColorBlendState = &vk_colorBlendStates[i];
vk_createInfos[i].pDynamicState = &vk_dynamicStateCreateInfo;
vk_createInfos[i].renderPass = static_cast<const CVulkanRenderpass*>(info.renderpass)->getInternalObject();
vk_createInfos[i].subpass = info.cached.subpassIx;
//handle
//index
//layout?
// ^ handled in initPipelineCreateInfo
}
}

core::vector<VkDynamicState> getDefaultDynamicStates(SPhysicalDeviceFeatures const& features) {
core::vector<VkDynamicState> ret = {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kevyuu you review this and give it the okay

Copy link
Member

@devshgraphicsprogramming devshgraphicsprogramming Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually wait a ssecond, @GDBobby I think you could turn all these free floating functions into struct methods of a struct that has reference members like core::vector<>& or span members const std::span<> since the vectors are preallocated

Comment on lines +12 to +14
//potentially collapse this so Mesh just uses CVulkanGraphicsPipeline
//if thats done, BindMesh can go away
class CVulkanMeshPipeline final : public IGPUMeshPipeline

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you still need correct inheritance unfortunately, you could template the class though

template<typename Base> requires std::is_base_of_v<IGPURasterizationPipeline,Base>
class CVulkanRasterizationPipeline : public Base

Comment on lines +1873 to +1875
meshShaderFeatures.primitiveFragmentShadingRateMeshShader = VK_FALSE;//needs to be explicitly set?
meshShaderFeatures.meshShaderQueries = VK_FALSE;
meshShaderFeatures.multiviewMeshShader = VK_FALSE;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the question is, for regular Multiview, queries and Shading rate, do we currently have them as limits or features?

Comment on lines +826 to +863
//check if features are existant first
//potentially put a copy of VkPhysicalDeviceMeshShaderFeaturesEXT directly into features
//depends on the less obvious properties
if (isExtensionSupported(VK_EXT_MESH_SHADER_EXTENSION_NAME)) {
features.meshShader = meshShaderFeatures.meshShader;
features.taskShader = meshShaderFeatures.taskShader;
//TODO
//VkBool32 multiviewMeshShader;
//VkBool32 primitiveFragmentShadingRateMeshShader;
//VkBool32 meshShaderQueries;

//VkPhysicalDeviceMeshShaderPropertiesEXT
//#define LIMIT_INIT_MESH(limitMemberName) properties.limits.limitMemberName = meshShaderProperties.limitMemberName
//LIMIT_INIT_MESH(maxTaskWorkGroupTotalCount);
//LIMIT_INIT_MESH(maxTaskWorkGroupInvocations);
//LIMIT_INIT_MESH(maxTaskPayloadSize);
//LIMIT_INIT_MESH(maxTaskSharedMemorySize);
//LIMIT_INIT_MESH(maxTaskPayloadAndSharedMemorySize);
//LIMIT_INIT_MESH(maxMeshWorkGroupInvocations);
//LIMIT_INIT_MESH(maxMeshSharedMemorySize);
//LIMIT_INIT_MESH(maxMeshPayloadAndSharedMemorySize);
//LIMIT_INIT_MESH(maxMeshOutputMemorySize);
//LIMIT_INIT_MESH(maxMeshOutputComponents);
//LIMIT_INIT_MESH(maxMeshOutputVertices);
//LIMIT_INIT_MESH(maxMeshOutputPrimitives);
//LIMIT_INIT_MESH(maxMeshOutputLayers);
//LIMIT_INIT_MESH(maxMeshMultiviewViewCount);
//LIMIT_INIT_MESH(maxMeshOutputPerVertexGranularity);
//LIMIT_INIT_MESH(maxMeshOutputPerPrimitiveGranularity);

//for(uint8_t i = 0; i < 3; i++){
// LIMIT_INIT_MESH(maxTaskWorkGroupCount[i]);
// LIMIT_INIT_MESH(maxTaskWorkGroupSize[i]);
// LIMIT_INIT_MESH(maxMeshWorkGroupCount[i]);
// LIMIT_INIT_MESH(maxMeshWorkGroupSize[i]);
//}
//#undef LIMIT_INIT_MESH
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Erfan-Ahmadi review and advise on how to expose

Comment on lines +29 to +38
inline const SCachedCreationParams& getCachedCreationParams() const
{
return pipeline_base_t::getCachedCreationParams();
}

inline SCachedCreationParams& getCachedCreationParams()
{
assert(isMutable());
return m_params;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the creation params are identical for Gfx and MEsh, should go in ICPURasterizationPipeline

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

99168b9#diff-45681900933e70545ee1a41c9ad2419f9cd40ef479ad7afe82dc97f29095c36aR13.
If Programmable Vertex Pulling is enforced, VertexParams could be removed from GraphicsPipeline. With preprocessor branching, primitive type could potentially be moved out the shader and into the creation params, allowng the creation params to be the same for both rasterization pipelines.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I take that back, I only thought about outputtopology. Preprocessor branching the rest of the shader per primitive type is unreasonable.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can use composition ot inheritance to split the SCachedCreationParams.

we will support vertex inputs as long as we support graphics pipelines.

namespace nbl::asset
{

class ICPUMeshPipeline final : public ICPUPipeline<IMeshPipeline<ICPUPipelineLayout,ICPURenderpass>>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make a ICPURasterizationPipeline : IRasterizationPipeline<ICPUPipelineLayout,ICPUrenderpass>

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried IRasterization earlier.
I took another look. I don't use polymorphism too much, so maybe there's an easy answer staring me in the face. Most of the issue is that IGPURasterization is an incomplete type unless it's specialized with IMesh or IGraphics, at which point it's not an effective abstraction.
image
Here is perhaps the simplest solution.
image

Another potential solution I have is removing IGraphics and IMesh and replacing with IRasterization. However, that introduces a new problem, the construction parameters would be above the IGPU/ICPU abstraction level, unless some virtual functions are used.
image

If none of these solutions are acceptable, my recommendation would be a larger rewrite of the Pipeline system as whole. I have ideas here, but I'll hold until further direction is given.

Comment on lines +18 to +19
https://registry.khronos.org/vulkan/specs/latest/man/html/VkGraphicsPipelineCreateInfo.html#VUID-VkGraphicsPipelineCreateInfo-renderPass-07064
* If renderPass is not VK_NULL_HANDLE, the pipeline is being created with pre-rasterization shader state, subpass viewMask is not 0, and multiviewMeshShader is not enabled, then pStages must not include a mesh shader

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can check that

https://registry.khronos.org/vulkan/specs/latest/man/html/VkGraphicsPipelineCreateInfo.html#VUID-VkGraphicsPipelineCreateInfo-pDynamicStates-07066
* If the pipeline requires pre-rasterization shader state, and includes a mesh shader, there must be no element of the
* pDynamicStates member of pDynamicState set to VK_DYNAMIC_STATE_PRIMITIVE_RESTART_ENABLE, or VK_DYNAMIC_STATE_PATCH_CONTROL_POINTS_EXT

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we choose the dynamic state ourselves (opinionated pick), and I think we never made those two dynamic, ever

maybe check the PRIMITIVE_RESTART enable

Comment on lines +43 to +46
https://registry.khronos.org/vulkan/specs/latest/man/html/VkGraphicsPipelineCreateInfo.html#VUID-VkGraphicsPipelineCreateInfo-renderPass-07720
* If renderPass is VK_NULL_HANDLE, the pipeline is being created with pre-rasterization shader state, and
* VkPipelineRenderingCreateInfo::viewMask is not 0, and multiviewMeshShader is not enabled, then pStages must not include a mesh shader

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what its saying is that you can't use multivew in pipeline create info with a mesh shader if you don't have the multiviewMeshShader feature

Basically there's the old multivew feature, but that only controls if multiview is available for a graphics pipeline, for mesh you have another thing to check instead

if (!processSpecInfo(fragmentShader, hlsl::ShaderStage::ESS_FRAGMENT)) return {};

if (!hasRequiredStages(stagePresence))
return {};

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because you haven't committed the IMeshPipeline.h file I can't see the impl of this

there will be another commit.
I renamed meshgraphicscommon to rasterizationpipelinecommon
i added mesh pipelines to cassetconverter
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants